Abstract

Rhizobia, the bacterial legume symbionts able to fix atmospheric nitrogen inside root nodules, have to 
survive in varied environmental conditions. The aim of this study was to analyse the transcriptional response 
to heat shock of Mesorhizobium loti MAFF303099, a rhizobium with a large multipartite genome of 7.6 Mb 
that nodulates the model legume Lotus japonicus. Using microarray analysis, extensive transcriptomic 
changes were detected in response to heat shock: 30% of the protein-coding genes were differentially 
expressed (2067 genes in the chromosome, 62 in pMLa and 57 in pMLb). The most highly induced genes are
in the same operon and code for two small heat shock proteins (sHSP). Only one of the five groEL genes
in the MAFF303099 genome was induced by heat shock. Unlike in other prokaryotes, the transcriptional
response of this Mesorhizobium included the underexpression of an unusually large number of genes (72%
of the differentially expressed genes). This extensive downregulation of gene expression may be an
important part of the heat shock response, as a way of reducing energetic costs under stress. To our
knowledge, this study reports the heat shock response of the largest prokaryote genome analysed so far,
representing an important contribution to understanding the response of plant-interacting bacteria to
challenging environmental conditions.

1. Introduction

Rhizobia are soil bacteria able to colonize legume 
roots and form nodules, where atmospheric nitrogen 
is metabolized into compounds that can be used by 
the plant. The impact of biological nitrogen fixation
carried out by rhizobia in agriculture is both economic
and environmental. Rhizobia may reduce the use of
chemical N-fertilizers, which represents a reduction in
production costs and, at the same time, a decrease in the
pollution resulting from N-fertilizer synthesis and
from soil nitrate lixiviation. 1

Rhizobia typically have large genomes, which are often composed of several replicons. These seem to
be common features of bacterial species that interact with a host. 2 This rhizobial tendency to harbour
a large accessory genome is probably related not only to the symbiosis itself (interacting with a
host), but also to the plasticity required to survive in complex and distinct environments. As
free-living bacteria, rhizobia have to cope with changes in soil conditions, and as plant symbionts
they must overcome plant defence mechanisms and adapt to the intracellular nodule environment. For all
the above reasons, these bacteria are particularly interesting for the study of stress response. The
most important consequences of heat stress at the cellular level are protein denaturation and
aggregation. 3 These effects are common to other adverse conditions, for example oxidative stress, so
the study of the heat stress response is also relevant to understanding tolerance to other stresses.

The plasticity to respond to stressful conditions involves rapid changes in gene expression.
Alternative sigma factors allow bacteria to rapidly redirect the RNA polymerase pool to the set of
genes that are required to respond to a certain condition. 4 Rhizobial genomes typically harbour a
large number of alternative sigma factors, including multiple copies of rpoH, which encodes σ32, the
major sigma factor involved in the heat shock response. 5 σ32 might be involved in the response to
other stresses, as seen in Rhizobium etli, where rpoH2 seems to be more related to the oxidative
stress response. 6 Furthermore, rhizobia with rpoH deletions may also be affected in their symbiotic
phenotype. 6,7 In Sinorhizobium meliloti, the transcription of ~21% of the genes induced in response
to a temperature upshift is rpoH1 dependent, and these genes include chaperones, proteases and small
heat shock proteins (sHSP). 5

Important chaperone systems, such as GroES-GroEL and DnaK-DnaJ-GrpE, are σ32-regulated in most
alphaproteobacteria. Chaperones play a key role in the heat shock response, as they promote the
acquisition of the native conformation by proteins that have been denatured and are incorrectly
folded. 8 The importance of chaperonins in defining tolerance to temperature has been highlighted by
several studies in E. coli. 9,10 A more recent study showed that a high level of the GroESL system has
a fundamental role in the evolution of heat tolerance. 11 Some important reports on the functional
analysis of the multiple groESL operons in rhizobia have been published. 12-14 Mutational studies
showed that groESL operons within the same genome are induced by different stimuli and that these
genes are involved not only in stress tolerance, but also in the nodulation and nitrogen fixation
processes. 15 In Mesorhizobium spp., both dnaK and groESL genes were reported to be transcriptionally
induced by a temperature upshift, especially in heat-tolerant isolates. 16 In rhizobia, groESL operons
are often CIRCE (controlling inverted repeat of chaperone expression) regulated, as already reported
in Bradyrhizobium japonicum, S. meliloti and Rhizobium leguminosarum. 13,17,18 CIRCE is a highly
conserved DNA sequence that serves as the binding site of the repressor protein HrcA. 19,20

Similar to the GroESL chaperonins, the DnaKJ system also seems to be involved in both heat tolerance
and symbiosis phenotype. 21-23 Regarding the co-chaperone dnaJ, studies of rhizobia mutants showed
that both stress tolerance and symbiotic performance are affected. 21,22,24

sHSP are mostly involved in preventing the irreversible aggregation of misfolded proteins. The
presence of a large number of sHSP is a common feature in rhizobia genomes. 25 Some sHSP have a
specific regulation designated repression of heat shock gene expression (ROSE). The ROSE element is a
posttranscriptional regulation mechanism that consists of a conserved sequence downstream of the
promoter. 26

The heat shock response has been extensively studied in bacteria; however, to our knowledge, only one
rhizobia strain has been studied in terms of heat shock transcriptome, namely S. meliloti 1021, a
symbiont of Medicago spp. 5,27 The strain analysed in the present report, Mesorhizobium loti
MAFF303099, is a rhizobium able to establish nitrogen-fixing symbiosis with Lotus species. 28,29 The
M. loti MAFF303099 genome comprises a large chromosome (7 Mb) and two plasmids designated as pMLa
(352 kb) and pMLb (208 kb). A chromosomal symbiosis island (610 kb) contains most genes involved in
nodulation and nitrogen fixation. A previous study showed that this strain is tolerant to heat shock
and cold conditions, and grows well at pH 5. 30

The aim of the present study is to characterize the transcriptional response to heat shock in a
resourceful rhizobium with a large and complex genome. The analysis of the global transcriptional
alterations following a sudden exposure to high-temperature conditions in M. loti MAFF303099 will
contribute to a better understanding of the general stress response, in particular in symbiotic
bacteria with multiple replicons and a large accessory genome.

2. Materials and methods 

2.1. RNA purification

Overnight cultures of M. loti MAFF303099 were 
grown in YMB 31 at 28°C, to a final optical density of 
0.3 (540 nm). A volume of 10 ml of bacterial culture
was used in each treatment: 30 min at control (28°C) 
and heat shock (48°C) conditions. Cells were harvested 
and total RNA was purified using RNeasy Mini Kit 
(Qiagen). Contamination with DNA was removed by 
DNase digestion (Roche), followed by RNA cleanup 
using RNeasy Mini kit (Qiagen). Total RNA integrity 
was checked using the RNA Nano kit and an Agilent 
2100 Bioanalyser (Agilent Technologies), while RNA 
quantification was performed using NanoDrop ND- 
1000 (NanoDrop Technologies). RNA was prepared 
from three independent cell cultures. 

2.2. Microarray experiments 

RNA processing as well as microarrays hybridization 
and raw data extraction were a service provided by 
Biocant Park— Genomics Unit (Portugal). In order to 
enrich the RNA samples in mRNA, the MICROBExpress™ Kit (Ambion) was used to remove most of
the rRNA. mRNA was then amplified with the MessageAmp™ II-Bacteria Kit (Ambion), with the
incorporation of 5-(3-aminoallyl)-UTP (Ambion) for indirect labelling, which was carried out by
coupling fluorescent Cy3 to the amplified RNA (aRNA), following the instructions of the Amino Allyl
MessageAmp™ II aRNA Amplification Kit (Ambion).

The 40 K array for M. loti MAFF303099 (MYcroarray) includes probes for 7231 genes (~99% of the total
number of protein-coding genes), with five replicates for each probe. Slide hybridization was carried
out as described by the microarray supplier, using the Gene Expression Hybridization Kit (Agilent
Technologies). Data were acquired using a DNA Microarray B Scanner (Agilent Technologies), with an
intensity of 100% PMT in the green channel.

2.3. Data analysis 

The microarray data were analysed using BRB ArrayTools (version 4.2). 32 The arrays were normalized
using the array median, and genes that were differentially expressed following heat shock were
identified using MeV software. 33 Genes were considered differentially expressed for P < 0.01 in the
t-test.
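
As an illustration of median normalization, the following minimal sketch centres the log2 intensities of one array on their median. It is not the BRB ArrayTools implementation; the function name and the intensity values are hypothetical and serve only to show the idea.

```python
import math
from statistics import median


def median_normalize(intensities):
    """Return log2 intensities centred on the median of the array."""
    logs = [math.log2(value) for value in intensities]
    centre = median(logs)
    return [value - centre for value in logs]


# Hypothetical raw single-channel (Cy3) intensities for one array.
array_intensities = [120.0, 480.0, 960.0, 240.0, 1920.0]
normalized = median_normalize(array_intensities)

# After normalization the median of the array is zero, so arrays
# processed in this way can be compared on a common scale.
print([round(value, 2) for value in normalized])
```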

Despite the recent update on the annotation of the MAFF303099 genome released by NCBI (October 2012),
all differentially expressed genes that were annotated as 'hypothetical protein' were further analysed
using Blast2GO software. 34 This analysis included Blast, Mapping and Annotation, and allowed further
annotation of many genes. In order to assign the highest possible number of genes to a clusters of
orthologous groups (COG) category, the STRING 9.0 database (search tool for the retrieval of
interacting genes) 35 was used.

MicrobesOnline Operon Predictions (www.microbesonline.org/operons/) was used for operon
prediction. 36 The identification of putative promoter sequences was performed using BPROM, a
bacterial promoter prediction software (www.softberry.com). DNAPlotter 37 was used to generate
circular DNA maps showing transcriptomics data.

Spearman's coefficient was used to test for correlation between genome size and number of over- or
underexpressed genes (IBM SPSS Statistics, version 21).
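
The rank correlation used here can be sketched in a few lines. The example below is a minimal pure-Python version of Spearman's coefficient (Pearson correlation computed on average ranks); it is not the SPSS procedure, and the genome sizes and gene counts are hypothetical values chosen only to exercise the function.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: genome size (Mb) and number of underexpressed genes.
genome_size_mb = [0.9, 1.6, 2.3, 4.6, 6.7, 7.6]
underexpressed = [40, 35, 120, 300, 280, 900]
rho = spearman(genome_size_mb, underexpressed)
print(round(rho, 3))
```

A significance test for rho is deliberately left out of this sketch; a statistics package would normally provide it.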

2.4. Microarray data validation 

Validation of microarray data was performed by real-time quantitative RT-PCR (qRT-PCR). cDNA was
obtained by reverse transcription using the Maxima First Strand cDNA Synthesis kit (Thermo
Scientific) according to the manufacturer's instructions. Primers (Supplementary Table S1) were
designed using Primer Express 3.0 software (Applied Biosystems). Real-time qRT-PCR reactions were
prepared using 0.1 ng/µl of template cDNA, SYBR Green PCR Master Mix and 0.3 mM of each primer.
Amplifications were carried out in a 7500 Real-time PCR System (Applied Biosystems). Ct values for
the target genes were normalized using the reference genes hisC, rpoA and sigA, which showed no
variation in the corresponding transcript levels for the experimental conditions used (data not
shown).
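
The exact formula used to normalize Ct values is not given above. A common choice is the comparative Ct (2^-ΔΔCt) method, sketched here under the assumptions of 100% amplification efficiency and normalization to the mean Ct of the reference genes; the function name and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_refs_treated,
                        ct_target_control, ct_refs_control):
    """Fold change of a target gene by the comparative Ct method.

    Assumes 100% PCR efficiency and uses the mean Ct of the
    reference genes as the normalizer in each condition.
    """
    delta_treated = ct_target_treated - mean(ct_refs_treated)
    delta_control = ct_target_control - mean(ct_refs_control)
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target gene and three reference genes.
fold = relative_expression(
    ct_target_treated=20.0, ct_refs_treated=[18.0, 18.0, 18.0],
    ct_target_control=24.0, ct_refs_control=[18.0, 18.0, 18.0],
)
print(fold)  # a 4-cycle earlier detection corresponds to a 16-fold increase
```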

3. Results and discussion 

3.1. Global transcriptional response

Analysis of the M. loti MAFF303099 transcriptome allowed the identification of 2186 protein-coding
genes that were differentially expressed after heat shock (out of 7231 genes analysed), with an
average false discovery rate of 1.5% (accession number GSE43529). This indicates that the transcript
levels of ~30% of the protein-coding genes were altered by this stress. The transcriptional response
included a much higher number of downregulated genes (1584) than upregulated genes (602) (Fig. 1).
The unexpectedly large proportion of downregulated genes does not seem to be a feature of rhizobia,
taking into account the similar numbers of induced and repressed genes reported for S. meliloti. 5,27
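
The percentages quoted in the abstract and in this section follow directly from the gene counts reported in the text, as the short check below shows. Only numbers stated in the study are used; the variable names are illustrative.

```python
# Differentially expressed genes per replicon, as reported in the abstract.
chromosome, pmla, pmlb = 2067, 62, 57
total_de = chromosome + pmla + pmlb

genes_on_array = 7231          # genes analysed on the microarray
downregulated, upregulated = 1584, 602

fraction_de = total_de / genes_on_array
fraction_down = downregulated / total_de

print(total_de)                      # 2186
print(round(100 * fraction_de))      # 30 (% of genes differentially expressed)
print(round(100 * fraction_down))    # 72 (% of those that are downregulated)
```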

To our knowledge, the present study reports the 
largest prokaryote genome studied so far in terms of re- 
sponse to heat shock. To investigate the influence of 
genome size in the global heat response, a comparison 
of the transcriptional response to heat of prokaryotes 
with different genome sizes was performed (Fig. 2). 
Strain MAFF303099 shows an unusually high proportion of downregulated genes in response to heat shock
compared with several other bacteria and archaea that, in general, show a similar number of genes
under- and overexpressed following a temperature upshift. Although diverse species with distinct
lifestyles, subjected to different heat shock conditions, are compared in Fig. 2, analysis of the
transcriptomic data suggests a general trend of pronounced increase in the number of downregulated
genes with genome size. One might speculate that many expendable genes are shut down, so that the
cellular machinery can be more effective in the synthesis of the specific functional response.
Nevertheless, the extensive gene downregulation is not particularly detected in the accessory genome,
which is presumably more dispensable. Indeed, the symbiosis island shows dispersed under- and
overexpressed genes similar to the rest of the
chromosome (Fig. 3). Furthermore, some highly induced genes are plasmid encoded, mainly in pMLb
(Fig. 4). This is somewhat unexpected since symbiosis islands and plasmids are mobile elements in the
genome, known to be laterally transferred within soil populations and thus less expected to carry
genes essential for stress survival. In addition, the set of 100 genes with the highest M-values
comprises 14 plasmid-encoded genes, while the 100 most underexpressed genes are all chromosomal
(Supplementary Table S2).

Figure 1. Microarray analysis of M. loti MAFF303099 subjected to heat shock. M-values for the
differentially expressed genes (P < 0.01) obtained from the comparison between heat shock (48°C) and
control (28°C) conditions. Genes with an increased amount of mRNA following the heat shock have
positive M-values (overexpressed), while genes with decreased mRNA levels after heat shock show
negative M-values (underexpressed).

Figure 2. Number of overexpressed (+) and underexpressed (O) genes resulting from the transcriptome
studies of the response to heat shock of 18 species of Bacteria and Archaea plotted against their
genome size. Trendlines are shown in grey for the number of overexpressed genes (R² = 0.35;
Spearman's ρ = 0.583, P < 0.05) and in black for the number of underexpressed genes (R² = 0.69;
Spearman's ρ = 0.608, P < 0.01). From the smallest to the largest genome size: Mycoplasma
hyopneumoniae 38; Tropheryma whipplei 39; Rickettsia prowazekii 40; Campylobacter jejuni 41;
Streptococcus thermophilus 42; Archaeoglobus fulgidus 43; Bifidobacterium longum 44; Xylella
fastidiosa 45; Listeria monocytogenes 46; Acidithiobacillus ferrooxidans 47; Corynebacterium
glutamicum 48; Desulfovibrio vulgaris 49; Clostridium difficile 50; Escherichia coli 51;
Methanosarcina barkeri 52; Shewanella oneidensis 53; S. meliloti 5; M. loti (this study). The two
rhizobia species are denoted in the graphic. Note: in case of multiple heat shock transcriptome
datasets for the same species, the dataset with the largest number of differentially expressed genes
was chosen.

The high number of underexpressed genes may suggest that the heat shock response relies on a
low-energy transcriptional response. Accordingly, ~40% of the induced genes show a low increase in
transcript levels (M < 1). This low level of gene induction, commonly disregarded, may be an
important part of the cell's response, as pointed out before by Wren and Conway. 54
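
To make the M < 1 threshold concrete, the sketch below converts M-values to fold changes. It assumes that the M-value is the log2 ratio between heat shock and control signals, which is the usual microarray convention but is not stated explicitly in the text; the function name is illustrative.

```python
def fold_change(m_value):
    """Convert an M-value to a fold change, assuming M = log2(treated / control)."""
    return 2.0 ** m_value


# Under this assumption, M < 1 corresponds to less than a 2-fold induction,
# while the highest M-value reported (6.61) is close to a 100-fold induction.
print(fold_change(1.0))
print(round(fold_change(6.61)))
```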

Analysis of the location of the differentially expressed genes in each replicon shows an apparently
random distribution of over- and underexpressed genes, with the exception of an ~200 kb-long region
located at positions 1 000 000-1 200 000 (462 genes), where all the differentially expressed genes
are downregulated (Fig. 3). Both in the chromosome and in the plasmids, the distribution of the
differentially expressed genes seems to be unrelated to the DNA strand or GC content.

Real-time qRT-PCR was used to validate the microarray data. Genes were chosen based on M-values from
the microarray results, in order to include overexpressed, underexpressed and not differentially
expressed genes, as well as genes encoded in both DNA strands and scattered in the chromosome. In
general, the results from the real-time qRT-PCR experiments are in agreement with the microarray
analysis results (Table 1), with the exception of the dnaK gene (discussed in the section 'DnaKJ
chaperone system').

Figure 3. Circular plots of the chromosome and two plasmids
included in the M. loti MAFF303099 genome showing, from outer
to inner rings: COG group for each gene; the heat shock
transcriptome data (M-values) and the %GC plot. The plasmid
plots include two additional outer rings displaying the genes
encoded in the plus strand (outermost ring) and minus strand.



Protein-coding genes can be grouped into COG according to their similarity in terms of domain
architecture and function. 55 The present study showed that temperature stress induced changes in the
expression of genes belonging to all COG categories of the MAFF303099 genome (Fig. 5). For all COG
categories, the percentage of underexpressed genes is higher than that of overexpressed genes (Fig. 5
and Supplementary Fig. S1). In addition to the fact that a high number of differentially expressed
genes are not in a COG (1580 genes), there are also many poorly characterized genes ('S: function
unknown' and 'R: general function prediction only' categories) (Supplementary Fig. S1).

The COG category with the highest percentage of overexpressed genes is 'L: replication, recombination
and repair' (9%). This COG category also shows the lowest percentage of underexpressed genes (12%).
Nevertheless, the percentage of overexpressed genes is between 7 and 8% in nine other categories,
including the COG category where chaperones and other heat shock proteins are included ('O:
posttranslational modification, protein turnover and chaperones'). This suggests a balanced response
in terms of gene induction throughout the COG categories; yet, ~13% of the overexpressed genes are
not in a COG. Three categories include a high percentage of underexpressed genes following a heat
shock, namely 'D: cell cycle control, cell division, chromosome partitioning', 'F: nucleotide
transport and metabolism' and 'N: cell motility' (53, 48 and 44%, respectively). COG categories with
a high number of overexpressed genes are 'K: transcription', 'G: carbohydrate transport and
metabolism' and 'E: amino acid transport and metabolism' (Supplementary Fig. S1A). On the other hand,
COG categories E and G also show a high number of underexpressed genes (Supplementary Fig. S1B). This
is consistent with other bacterial species for which these two COG categories also showed a high
number of over- and underexpressed genes in response to heat shock. 46,48 According to Konstantinidis
and Tiedje, 56 large genomes tend to have a disproportionate increase of genes belonging to COG 'K:
transcription', 'T: signal transduction mechanisms' and 'Q: secondary metabolites biosynthesis,
transport and catabolism', which could therefore be expected to be the most underexpressed categories
in large-genome bacteria; nevertheless, that is not observed in MAFF303099 (Fig. 5 and Supplementary
Fig. S1B).



COG colours: information storage and processing — blue; cellular 
processes and signalling — green; metabolism — magenta; poorly 
characterized — yellow; more than one COG category — brown; no 
COG — light grey. Transcriptome data: overexpressed — black; 
underexpressed — grey. %GC data: above average — dark red; below 
average — orange. The symbiosis island (coordinates 4 644 792- 
5 255 766) 28 is marked in blue in the chromosome plot. This 
figure appears in colour in the online version of DNA Research. 

Figure 4. Number and location of differentially expressed genes in
M. loti MAFF303099, following the heat shock.



Table 1. Microarray data validation using real-time qRT-PCR

Locus tag   Gene    M-value (real-time qRT-PCR)   M-value (microarrays)
mll2386     -       14.6                          6.6
mlr2394     groEL   11.9                          5.8
mll1528     -       4.6                           4.7
mll3429     clpB    7.0                           2.9
mll3842     citZ    6.1                           2.2
mlr5932     acdS    1.5                           1.1
mll3873     -       -0.6                          -1.9
mlr0883     gcvT    -0.9                          -2.2
mlr6118     -       -2.4                          -2.7
mll1546     ftsZ    -3.4                          -3.7
mll6630     -       -3.8                          -4.0
mlr2911     flgB    -3.7                          -4.3
mll6578     fixK    -4.0                          -5.1
mll6432     -       0.3                           nde
mll4757     dnaK    5.5                           nde
mll4755     dnaJ    -0.2                          nde
mlr7618     greA    0.9                           nde

nde, not differentially expressed.
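
The agreement between the two techniques can be summarized numerically from the values in Table 1. The sketch below uses the 13 genes with a microarray M-value and computes the Pearson correlation and the agreement in direction of change; this statistic is an illustrative choice and is not reported in the study.

```python
# M-values from Table 1 for the 13 genes differentially expressed in the microarrays.
qrt_pcr = [14.6, 11.9, 4.6, 7.0, 6.1, 1.5, -0.6, -0.9, -2.4, -3.4, -3.8, -3.7, -4.0]
microarray = [6.6, 5.8, 4.7, 2.9, 2.2, 1.1, -1.9, -2.2, -2.7, -3.7, -4.0, -4.3, -5.1]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


r = pearson(qrt_pcr, microarray)

# True when every gene changes in the same direction in both techniques.
same_direction = all((a > 0) == (b > 0) for a, b in zip(qrt_pcr, microarray))

print(round(r, 2), same_direction)
```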

3.2. Small heat shock proteins 

The two most heat shock-induced genes (mll2387 and mll2386, with M-values of 6.61 and 6.32,
respectively) code for sHSP (Table 2). These genes are probably co-transcribed, since a single
putative promoter was identified upstream of mll2387 (predicted promoter: -35 TTGACG and -10
ACTCATTCT). This particular sHSP operon is likely to play an important role in the heat shock
response, since homologous genes were also detected as the most overexpressed in S. meliloti
following a less severe heat shock. 5 Following a longer heat exposure, these genes seem to be less
overexpressed, yet still show an induction of approximately 4-fold. 27 The homologous ibpAB genes are
also the most-induced genes in the heat shock response of E. coli. 51 Western analysis of protein
extracts of several rhizobia species confirmed an increase in the amount of sHSP with temperature
upshifts. 25 As in many other bacterial species, in M. loti, S. meliloti and E. coli a ROSE element
was identified upstream of these operons 26 (Supplementary Fig. S2A). Nevertheless, these sHSP genes
were reported as rpoH-dependent in S. meliloti, 5 suggesting multiple regulation mechanisms that may
allow a dynamic stress response.
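
The predicted promoter boxes given above can be located in a sequence with a simple pattern search. The sketch below is a toy scan, not the BPROM algorithm: the spacer range of 15-19 nt between the two boxes and the test sequence are assumptions made for illustration only.

```python
import re


def find_promoters(sequence, box35="TTGACG", box10="ACTCATTCT", spacer=(15, 19)):
    """Return (start of -35 box, start of -10 box) for each candidate promoter.

    A lookahead is used so that overlapping candidates are also reported.
    """
    pattern = re.compile(
        f"(?=({box35})([ACGT]{{{spacer[0]},{spacer[1]}}})({box10}))"
    )
    return [(m.start(1), m.start(3)) for m in pattern.finditer(sequence)]


# Hypothetical upstream region containing both boxes separated by 17 nt.
upstream = "GG" + "TTGACG" + "A" * 17 + "ACTCATTCT" + "CC"
hits = find_promoters(upstream)
print(hits)
```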

Rhizobia genomes carry a large number of sHSP. 25 Strain MAFF303099 has eight genes identified as
sHSP, of which four were highly induced by heat shock (mll2387, mll2386, mll9627 and mll3033), two
remained unaltered (mll2257 and mlr3192) and two were underexpressed (mlr4720 and mlr4721). sHSP can
be divided into two classes in terms of sequence: class A includes sHSP similar to E. coli IbpAB,
while sHSP grouped in class B are more divergent in terms of sequence. 25 Gene mll2387 belongs to
class A, while mll2386 is more divergent and considered a class B sHSP. 57 According to Studer &
Narberhaus, 58 it is improbable that mll2386 and mll2387 could form hetero-oligomers even if
co-expressed, since in B. japonicum hetero-oligomers only occurred between sHSP from the same class.
All class A sHSP from M. loti MAFF303099 (mll2387, mll3033, mlr3192 and the plasmid-encoded mll9627)
showed a ROSE element downstream of the promoter, which would confer high-temperature sensitivity to
the transcription of these genes 26 (Supplementary Fig. S2A). However, one of these sHSP was not
overexpressed following the heat shock tested (mlr3192, hspH), despite the fact that its B. japonicum
homolog, also regulated by a ROSE element, is heat inducible. 25

3.3. GroESL chaperone system 

Similar to other heat shock related genes, rhizobia genomes harbour several copies of the groESL
operon, usually with different regulation mechanisms and expression kinetics. 15 M. loti MAFF303099
has four groESL operons in the chromosome and one in pMLa. Of these five operons, only one appears to
be involved in the heat shock response, namely the one including the groEL gene mlr2394, which was
strongly overexpressed after heat shock exposure (M = 5.79). This groEL gene is highly similar to
groEL5 and groEL1 from S. meliloti (87 and 83% amino acid identity, respectively), which are the most
heat shock-inducible copies in that species. 5,27

From what is known from other rhizobia genera, only some groESL operons encoded in the same genome
are heat inducible, and those can be regulated either by σ32 or by the CIRCE element. 12,13,59 In the
case of MAFF303099, a CIRCE element was found upstream of all groESL operons (Supplementary
Fig. S2B). The exact same consensus sequence of this inverted repeat is found in three operons, and
the remaining two operons differ in only two positions. The overexpressed groEL gene belongs to one
of the operons regulated by a slightly divergent CIRCE element. Our results suggest that the presence
of a CIRCE consensus sequence does not ensure a highly efficient induction under heat stress
conditions. A similar situation was detected in R. leguminosarum, where a putative CIRCE element was
found upstream of all three groESL operons, and further analysis of this regulation mechanism showed
that the most heat-inducible operon was indeed CIRCE regulated, but a second operon, less induced by
heat, was not affected by CIRCE deletion or hrcA knockout. 17 This second operon was rpoH regulated,
suggesting an overlapping of regulation mechanisms. 17

Figure 5. Percentage of genes from each COG category overexpressed and underexpressed after the heat
shock. Genes not in a COG are also shown. The number of genes included in each category is shown at
the right end of the graphic.
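
The comparison of CIRCE elements described in this section amounts to counting mismatches against a consensus inverted repeat. The sketch below assumes the widely cited CIRCE consensus TTAGCACTC-N9-GAGTGCTAA, which is not quoted in this study; the sites compared are hypothetical and do not correspond to the actual MAFF303099 sequences (see Supplementary Fig. S2B for those).

```python
# Assumed CIRCE consensus: two 9-nt arms separated by a 9-nt spacer.
CIRCE_LEFT = "TTAGCACTC"
CIRCE_RIGHT = "GAGTGCTAA"
SPACER_LENGTH = 9

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def circe_mismatches(site):
    """Count positions in the two arms of a 27-nt site that differ from the consensus."""
    expected_length = len(CIRCE_LEFT) + SPACER_LENGTH + len(CIRCE_RIGHT)
    if len(site) != expected_length:
        raise ValueError(f"expected a {expected_length}-nt site")
    arms = site[:len(CIRCE_LEFT)] + site[-len(CIRCE_RIGHT):]
    return sum(a != b for a, b in zip(arms, CIRCE_LEFT + CIRCE_RIGHT))


# Hypothetical sites: one perfect match and one with two arm substitutions.
perfect_site = CIRCE_LEFT + "ACGTACGTA" + CIRCE_RIGHT
divergent_site = "TTGGCACTC" + "ACGTACGTA" + "GAGTGCCAA"

print(circe_mismatches(perfect_site), circe_mismatches(divergent_site))
```

The two arms of the assumed consensus are reverse complements of each other, which is what makes the element an inverted repeat.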

Despite the high M-value detected for the groEL gene mlr2394, the expression of the groES gene in the
same operon (mlr2393) remained unaltered following the heat shock. Similarly, in S. meliloti, the
gene SMb22023 (groES5) was not induced by heat shock, despite the high induction of the corresponding
groEL5 gene (SMb21566). 5,27 No promoter could be identified in the 59 bp groES-groEL intergenic
space using BPROM, so a bicistronic mRNA should be synthesized. A posttranscriptional cleavage could
explain why only the transcript of the second gene in the operon is highly abundant. A cleavage event
occurs in the groESL transcript of Agrobacterium tumefaciens, explaining why the transcript
corresponding to groEL alone is the abundant mRNA detected after heat shock. 60 Analysing the
intergenic space in the MAFF303099 groES-groEL operon using the 'KineFold Webserver', 61 a stem-loop
structure was found, though weaker than the one described to undergo cleavage in A. tumefaciens (data
not shown). GroES-GroEL complexes comprising proteins encoded by different operons tend to be less
efficient than the chaperonin complexes encoded by the same operon. 62 However, the predominant
GroES-GroEL complex consists of a single heptameric ring of 10 kDa monomers (GroES) plus two rings of
seven 60 kDa monomers (GroEL), so the ratio between the two is 1:2, which is consistent with a lower
groES transcription.



3.4. DnaKJ chaperone system 

The role of the DnaKJ chaperone system in stress re- 
sponse is well known in other bacteria; however, few 
studies address these heat shock proteins in rhizobia. In 
the present study, dnaK (mll4757) and the co-chaper- 
one dnaj (mll4755) were not found to be significantly 
heat shock induced. Nevertheless, the real-time qRT- 
PCR results (Table 1) show that dnaK was induced 
by heat shock, agreeing with previous studies in 
Mesorhizobium} b Approximately, 2-fold induction of 
the dnaK gene was detected in S. meliloti cells exposed 
to 40°C for 30 min, 5 while no induction was reported 
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Table 2. 


Overexpressed genes following 


;the heat shock, identified by microarray analysis. 




Locus tag 


Replicon 


COG category 3 


Gene description 


A/I-value 


mll2386 


Chr 


n 




6.61 


mll2387 


Chr 


n 


cWQP 


6.32 


mll3685 


Chr 




i Ki_-udiici uoindin-C-Oiiidining protein 


6.08 


msl2054 


Chr 




1— Iwnrtthpt ir^ I nrntp i n 
n y JJtJLI IC LICd 1 JJIULCIIl 


5.84 


mll1959 


Chr 




DA 1 /fl \s TGinilv nrntpin 
DA 1 tK Id 1 III ly pi (J LCI II 


5.82 


mlr2394 


Chr 


o 


MoIppi t 1 3 r rha np rnnp-n mF 1 

fVlLIICLUICil 1 I Ci IJ\Z I KJ 1 1 C UIULL 


5.79 


mll7465 


Chr 


Q 


A\D\^ LI d 1 lb|JLJI LCI ]JC 1 1 1 led be 


5.79 


msr9689 


pMLb 




ny (J(JLI ItrLILd 1 JJIULCIM 


5.62 


msr8048 


Chr 




ny (J(JLI ItrLILd 1 (JIULclll 


5.61 


msl2390 


Chr 




1 lew fa tn i lv nrnfpin 
USt Idllllly pit-)LCIII 


5.54 


msl1 808 


Chr 




ny JJULI ICLICd 1 JJIULCIM 


5.49 


mlr4836 


Chr 


HC 


EArt.liin/lin(Y n mf pm 
lYlUlllMJAYzlClldaC wr\U LllllUIIlg pi tJ LCI II 


5.27 


mlr21 58 


Chr 


jK 


Metallo-beta-lactamase superfamily protein 


5.25 


mlr2234 


Chr 




1— Ivnothptir^ 1 nrntPin 

1 1 y UULI IC LICd 1 L/1CLCIII 


5.25 


mll9627 


pMLb 


u 


c UICD 
SMor 


5.23 


mlr21 60 


Chr 


n 
K 


Transporter component 


5.1 7 


mlr51 53 


Chr 




Transmembrane protein 


5.09 


mll9357 


pMLa 


c 


Domain-containing protein 


5.03 


mll3694 


Chr 


T 
I 


1 I d 1 IbCI 1 [JLIUI Id 1 ICgUldLUI 


4.98 


Locus tag   Replicon   COG category a   Gene description                                M-value
mll1952     Chr        C                Norsolorinic acid reductase                     4.98
mll4827     Chr        J                Endoribonuclease L-PSP                          4.89
mlr2159     Chr        K                Transcriptional regulator                       4.86
msr8615     Chr        K                Transporter component                           4.82
msl3831     Chr        -                Conserved hypothetical transmembrane protein    4.79
mlr9581     pMLb       -                PRC-barrel protein                              4.78
mll4607     Chr        S                Ku protein                                      4.66
mll1528     Chr        -                Small integral membrane protein                 4.65
mlr8230     Chr        -                Transcriptional regulator                       4.56
mlr0408     Chr        -                Transmembrane anti-sigma factor                 4.56
msr2497     Chr        G                Hypothetical protein                            4.47
mll8293     Chr        -                Hypothetical protein                            4.38
msl2212     Chr        -                Family transcriptional regulator                4.36
msl7604     Chr        -                Hypothetical protein                            4.31
msl7943     Chr        -                Hypothetical protein                            4.28
msl9358     pMLa       -                Transcription factor                            4.27
mlr2125     Chr        -                Hypothetical protein                            4.27
mlr3707     Chr        -                Hypothetical protein                            4.25
mlr0407     Chr        -                RNA polymerase sigma factor                     4.16
mll3692     Chr        -                Hypothetical protein                            4.14
msr8675     Chr        -                Hypothetical protein                            4.11
mlr3233     Chr        N                Host attachment protein                         4.07
msr4317     Chr        -                Hypothetical protein                            4.04
mll2066     Chr        D                Mobile mystery protein b                        4.02
mll6953     Chr        R                Domain-containing protein                       4.02
mll6858     Chr        RIQ              Short chain dehydrogenase                       3.98
mlr1797     Chr        S                Conserved domain protein                        3.91
mll3445     Chr        C                Luciferase-like protein                         3.91
mll2211     Chr        C                Morphinone reductase                            3.89
msl6857     Chr        R                Hypothetical protein                            3.88
mll8179     Chr        S                Family protein                                  3.83



The 50 genes with the highest M-values are shown. Gene descriptions shown in bold resulted from sequence analysis using
Blast2GO software.

a COG category letters according to NCBI functional categories (http://www.ncbi.nlm.nih.gov/COG/grace/fiew.cgi). 



for a shorter heat shock (42°C for 15 min). 27 In the
microarray analysis, the changes in expression levels of
the dnaK gene were considered not statistically significant
due to discrepancies among replicates.

It was reported for several rhizobia species that dnaJ
deletions cause reduced growth at high temperatures; 21,24
however, no transcriptional activation of dnaJ following
a heat shock was detected in the present study
or in other studies with S. meliloti. 5,27 Similar to our
results, no induction of grpE was reported for S. meliloti
by Sauviac and coworkers, 27 while a different study
showed induction of grpE by heat shock. 5

Another heat shock protein that interacts closely with
the DnaKJ system is ClpB. The clpB gene
(mll3429) was found to be overexpressed in the
present study, with an M-value of 2.93. The clpB gene
was already seen to be upregulated following a heat
shock in S. meliloti, 5,27 and the importance of ClpB in
the rhizobia stress response, especially to heat shock, was
also previously reported. 63 Similar to E. coli, the knockout
of the clpB gene in Mesorhizobium ciceri led to an
inability to endure high temperatures. Furthermore, in
M. ciceri the symbiotic performance was also negatively
affected. 63,64 These results are consistent with the
role of ClpB in the disaggregation of denatured proteins,
namely through its cooperation with the DnaKJ system. 65
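
As a reading aid only, the M-values quoted above can be converted into linear fold changes. The short sketch below assumes that the M-value is the log2 ratio of the heat shock signal to the control signal, which is the usual microarray convention but is not restated in this section. The function name fold_change is purely illustrative.

```python
def fold_change(m_value: float) -> float:
    """Convert an M-value to a linear fold change.

    Assumption: M is a log2(heat shock / control) expression ratio.
    """
    return 2.0 ** m_value


# M-values quoted in the text and in Table 2 (locus tag -> M-value).
m_values = {
    "mll3429 (clpB)": 2.93,
    "mlr0407 (sigma factor)": 4.16,
    "mll4607 (Ku protein)": 4.66,
}

for locus, m in m_values.items():
    print(f"{locus}: M = {m:.2f}, fold change = {fold_change(m):.1f}")
```

Under this assumption, an M-value of 1 corresponds to a doubling of transcript abundance, and negative M-values correspond to underexpression.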

3.5. Sigma factors 

Rhizobia usually have multiple copies of genes encoding
the same sigma factors, for example rpoH and rpoE.
The M. loti MAFF303099 genome includes 25 putative
sigma factors, of which four were induced by heat
shock (mlr0407, mll3697, mll8140 and mlr3807).
None of these sigma factor-encoding genes is completely
annotated; nevertheless, BLAST analysis
showed that loci mlr0407 (highly induced) and
mll8140 are similar to both σ70 and σ24, and
mlr3807 is more similar to σ24, while mll3697 shows
high similarity to the S. meliloti rpoE2 gene (76%).
Sauviac and collaborators 27 suggested RpoE2 as the
major global regulator of stress response in S. meliloti,
despite the fact that no phenotype change was detected
in the rpoE2 mutant. Our results are consistent with
that suggestion, since mll3697 is overexpressed in
heat shock conditions with an M-value of 2.4. The
gene mll2869 encoding σ70 was found to be underexpressed
following heat shock conditions, which may
contribute to the extensive downregulation detected
in the MAFF303099 transcriptional response.

Sigma factors typically related to the heat shock response,
such as σ32 (rpoH) and σ24 (rpoE), which are probably
encoded by mlr3741 and mlr8088 in MAFF303099,
were not affected at the transcriptional level by the
heat shock conditions applied. The gene rpoH2
(mlr3862) was also not induced in the conditions
used in this study. Similarly, Martinez-Salazar and
coworkers 6 reported that none of the rpoH genes were
induced by heat shock in R. etli. Nevertheless, rpoH
mutants are usually impaired in their stress tolerance
phenotype, as is the case for S. meliloti and R. etli. 6,66
rpoH1 controls the expression of ~21% of the heat
shock-induced genes in S. meliloti and is also related
to the oxidative stress response, while rpoH2 seems to
play a minor role in the heat shock response and is
more involved in osmotic tolerance. 5,6

In E. coli, rpoH regulation seems to occur more at the
protein level than at the transcriptional level. This
control hypothesis is known as the 'unfolded protein
titration model' and involves the most important chaperone
systems: under normal growth conditions, σ32
binds to DnaKJ and GroESL, so it becomes unavailable
for RNA polymerase binding; under heat stress,
misfolded proteins have higher affinity for the chaperone
systems, so σ32 would be released. 67 This post-translational
regulation has not been investigated in rhizobia;
nevertheless, the fact that no rpoH induction was
detected under heat stress conditions is consistent
with the proposed model.

3.6. Nodulation and nitrogen fixation genes 

Some of the genes involved in nodulation and nitrogen
fixation were differentially expressed after heat shock.
Several fix genes showed severe underexpression,
especially fixK, which encodes a transcriptional
regulator and was the most underexpressed gene
following the heat stress (Supplementary Table S2).
FixK is an activator of several operons, namely
fixNOQP and fixGHIS, and the fixK gene is upregulated
under micro-oxic conditions. 68 The MAFF303099 genome
encodes two fixNOQP operons (encoding cytochrome
oxidases), one located in the symbiosis island.
Interestingly, all the fix genes found to be underexpressed
(fixK, fixJ, fixS, fixI, fixP, fixO, fixN) are outside
the symbiosis island. Uchiumi and collaborators 69
suggested that rhizobia might have acquired a housekeeping
fixNOQP operon before the acquisition of the
symbiosis island. Similar to the present study, fix genes
were previously detected to be underexpressed after a
heat shock in S. meliloti, 5 so a downregulation of the
fixK cascade upon high-temperature conditions seems
to be consistent. In S. meliloti, fixK is negatively regulated
by the activity of fixT; however, no fixT gene is annotated
in the MAFF303099 genome (the most similar gene is
msl5852, which is not differentially expressed in the
present study).

Of the high number of nodulation genes encoded in
the M. loti MAFF303099 genome (>40 genes), only 11
showed altered transcript levels after the heat shock.
The genes nodC and nodE were heat induced, while nine
other nodulation genes were underexpressed. Only
nodL was previously reported to be underexpressed
following heat shock conditions in S. meliloti, 5,27 but the
expression of this gene remained unaltered in the present study.

3.7. Other heat shock-inducible genes 

Among the 50 genes with the highest M-values
(Table 2), there are five transcriptional regulators, one
sigma factor and one anti-sigma factor, which indicates
that the heat shock response is a complex system with
relevant control at the transcriptional level.

Additional analysis, performed in this study, of all
differentially expressed hypothetical proteins allowed
further characterization of many genes, for example
mll4607, which is now annotated as Ku protein
(Table 2). Together with LigD, this protein is involved
in DNA repair, namely in the repair of double-strand
DNA breaks by non-homologous end-joining. 70 Unlike
other bacteria, rhizobial genomes encode multiple copies
of this Ku/LigD system, which has been further studied in
S. meliloti. 71 Although none of the ku homologues is
required for the symbiosis establishment, this DNA
repair system is active in both free-living cells and
bacteroids. 71 Of the four ku homologues in the
MAFF303099 genome, three are induced by heat
shock (mll4607, mlr9624 and mlr9623), as well as
one of the three ligD homologues (mll9625). Until
recently, double-stranded DNA breaks (DSB) were not
thought to be a consequence of heat shock; however,
a recent study, using eukaryotic cells, showed that
heat shock may in fact induce DSB in certain phases
of the cell cycle. 72 It is tempting to agree with the
suggestion from Kobayashi and coworkers 71 that
these systems do have some role under stress conditions,
such as heat shock.

Altogether our results suggest that in a large bacterial 
genome, the extensive gene downregulation may be an 
important part of the heat shock response. Although 
the present study has contributed to further knowledge 
on rhizobia stress response, future studies are required 
to understand the role of individual genes and the 
mechanisms regulating these molecular responses. 